ABSTRACT


In this thesis we extend several results on prediction
intervals that were obtained under the assumption that the
sample observations are independent and identically
distributed as $N(\mu, \sigma^2)$. We assume that the samples are
correlated with a prescribed correlation structure and show
that many of the results available for the independent case
apply equally well to the correlated samples. The
correlation structure assumed occurs in variance components
models in Analysis of Variance, and the results can also be
applied to the case where the samples have an intraclass
correlation (equicorrelated samples).
I.   INTRODUCTION

A  prediction  interval  is  a  random  interval  that  contains 
the  value  of  a  future  observation  or  some  function  of 
future  observations  and  whose  end  points  are  functions  of 
previous  sample  values.   Such  an  interval  provides  an 
indication  of  the  uncertainty  in  the  future  observations. 
More  specifically,  a  100r  percent  prediction  interval  for
the  value  of  a  future  sample  is  an  interval  that  is  based 
on  a  previous  sample  and  encloses  the  future  observations 
with  probability  r,  independent  of  the  values  of  the 
distribution  parameters,  such  as  the  mean  or  the  standard 
deviation.   A  prediction  interval  needs  to  be  distinguished 
both  from  a  confidence  interval  and  a  tolerance  interval; 
a  confidence  interval  encloses  the  value  of  an  unknown 
parameter  and  a  tolerance  interval  is  an  interval  within 
which  a  specified  proportion  of  the  population  values  will 
lie  with  a  specified  probability. 

In  many  practical  problems,  it  would  be  of  interest  to 
construct  a  prediction  interval  for  the  values  of  the  next 
k  sample  values  from  a  population.   For  example,  if  only 
one  machine  is  available  for  testing  and  we  must  perform 
trials  sequentially,  a  prediction  interval  could  provide 
helpful  information  about  the  total  time  needed  to  complete 
the  experiment  or  perhaps  the  number  of  trials  it  would  be 
possible  to  perform.   Another  application  of  prediction
intervals  is  in  forecasting  before  a  planned  experiment  is
completed.   In  an  experiment  where  each  observation  is 
expensive  or  where  they  can  be  made  only  infrequently, 
prediction  intervals  may  be  helpful  in  reaching  a  decision 
on  the  profitability  of  continuing  the  experiment  at
intermediate  points  in  the  experiment.   For  example,  when  the
experiment  concerns  a  physical  input  or  output,  preliminary 
estimates  of  the  ultimate  amount  of  needed  input  material 
or  of  the  ultimate  storage  needed  for  the  output  might  be 
helpful.   In  other  situations  where  the  random  variable  is 
the  "time  until  occurrence  of  an  event,"  and  where  physical 
limitations  prevent  the  concurrent  running  of  all  planned 
trials,  prediction  intervals  might  provide  helpful
information  concerning  the  total  time  until  completion  of  the
planned  experiment.   Prediction  intervals  are  also  of 
frequent  interest  to  a  typical  consumer  of  one  or  a  small 
number  of  units  of  a  given  product.   Such  an  individual  is 
generally  more  directly  concerned  with  the  future  performance 
of  his  specific  sample  than  in  the  process  from  which  the 
sample  had  been  selected.   A  prediction  interval  to  contain 
each  of  the  values  of  the  sample  would  then  provide  him 
with  an  interval  within  which  he  may  expect  the  performance 
of  all  his  units  to  be  located  with  a  high  probability. 
Based  upon  his  experience  with  a  previous  sample  of  10  light 
bulbs,  a  consumer  might  wish  to  construct  an  interval  which 
would  have  a  high  probability  of  including  the  performance 
values  of  each  of  three  additional  bulbs. 


In  this  thesis  we  derive  prediction  intervals  for  one 
future  sample  observation  as  well  as  simultaneous  intervals 
for  a  specified  number  of  future  sample  observations  when 
the  samples  are  correlated.   These  results  are  obtained  as 
extensions of results due to Hahn [5]. He derived similar
intervals for the case where the samples are independent and
identically distributed as $N(\mu, \sigma^2)$.

In Chapter III it is shown that Hahn's prediction interval
for the standard deviation of a single future sample is
valid even in the case where the sample values are correlated
and have a multivariate normal distribution with mean vector
$\mu = (\mu, \mu, \mu, \ldots, \mu)'$ and covariance matrix $V$ having the
following structure:

$$V_{n \times n} = \tfrac{1}{2}\left(H_{n \times n} + H'_{n \times n}\right) + a\left(I_{n \times n} - E_{n \times n}\right) \tag{1.1}$$

where

$$H_{n \times n} = \begin{pmatrix} h_1 & h_1 & \cdots & h_1 \\ h_2 & h_2 & \cdots & h_2 \\ h_3 & h_3 & \cdots & h_3 \\ \vdots & \vdots & & \vdots \\ h_n & h_n & \cdots & h_n \end{pmatrix},$$

$H'$ is the transpose of $H$, $h_i$ ($i = 1, 2, 3, \ldots, n$) and $a$ are
positive constants, $I$ is an $n \times n$ identity matrix, and $E$ is
an $n \times n$ matrix all of whose elements are unity.

Simultaneous  prediction  intervals  for  the  standard 

deviations  of  k  future  samples  are  also  derived  and  examples 

illustrating  the  results  are  provided. 

A  covariance  matrix  with  the  above  structure  occurs  in 

the  study  of  random  effects  models  in  analysis  of  variance. 

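If samples are drawn from a normal distribution $N(\mu, \sigma^2)$
and it is assumed that $\mu$ itself is normally distributed
as $N(\eta, \sigma_\mu^2)$, then it can be shown that the sample values
have a multivariate normal distribution with mean vector
$\eta = (\eta, \eta, \eta, \ldots, \eta)'$ and covariance matrix

$$V = \begin{pmatrix} \sigma^2 + \sigma_\mu^2 & \sigma_\mu^2 & \cdots & \sigma_\mu^2 \\ \sigma_\mu^2 & \sigma^2 + \sigma_\mu^2 & \cdots & \sigma_\mu^2 \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_\mu^2 & \sigma_\mu^2 & \cdots & \sigma^2 + \sigma_\mu^2 \end{pmatrix}.$$
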
It can be seen that the matrix $V$ has the same structure
as in (1.1) by letting $h_1 = h_2 = \ldots = h_n = \sigma^2 + \sigma_\mu^2$ and $a = \sigma^2$. A
possible application of the results of this thesis is in
the following situation. From a lot containing a large
number of guns, n are selected at random. Each of these
guns is then fired k times and the resulting miss distances
from a target are measured. Based on the mean of the
measured miss distances, a prediction interval for the
miss distance of a randomly chosen gun may be constructed.
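
The correspondence between the random effects covariance matrix and structure (1.1) can be checked numerically. The sketch below assumes NumPy; the helper name `structured_cov` and the variance values are illustrative choices, not taken from the thesis.

```python
import numpy as np

def structured_cov(h, a):
    """Build V = (H + H')/2 + a (I - E) as in equation (1.1)."""
    h = np.asarray(h, dtype=float)
    n = h.size
    H = np.tile(h[:, None], (1, n))          # row i is (h_i, h_i, ..., h_i)
    return 0.5 * (H + H.T) + a * (np.eye(n) - np.ones((n, n)))

# Hypothetical variance components (illustrative values only).
sigma2, sigma_mu2, n = 4.0, 1.5, 5

# Covariance of X_j = mu + e_j with Var(mu) = sigma_mu2 and Var(e_j) = sigma2.
V_random_effects = sigma2 * np.eye(n) + sigma_mu2 * np.ones((n, n))

# Structure (1.1) with h_i = sigma2 + sigma_mu2 and a = sigma2.
V_structured = structured_cov([sigma2 + sigma_mu2] * n, a=sigma2)

same = bool(np.allclose(V_random_effects, V_structured))
print(same)
```

Under these choices the diagonal entries are $h_i$ and the off-diagonal entries are $h_i - a$, which is how the two forms line up.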

Chapter  IV  deals  with  procedures  for  constructing  a 
prediction  interval  to  contain  a  single  additional
observation  and  also  with  constructing  a  simultaneous  prediction
interval  to  contain  all  k  additional  future  observations, 
when  the  samples  are  correlated  and  the  covariance  matrix 
has  the  structure  as  in  equation  (1.1). 
II.   SUMMARY  OF  KNOWN  RESULTS

A.   DEFINITIONS  AND  NOTATIONS 

Let $X_{ij}$, $i = 0, 1, 2, 3, \ldots, k$ and $j = 1, 2, 3, \ldots, n_i$, be k+1 sets
of random samples of size $n_i$ from a normal distribution
$N(\mu, \sigma^2)$. The $n_0$ samples for $i = 0$ are considered as the
given sample and the remaining k sets are future samples
for which prediction intervals are needed.

Let

$$\bar{X}_i = \frac{1}{n_i} \sum_{j=1}^{n_i} X_{ij}$$

and

$$S_i^2 = \frac{1}{n_i} \sum_{j=1}^{n_i} (X_{ij} - \bar{X}_i)^2$$

where $i = 0, 1, 2, 3, \ldots, k$ and $j = 1, 2, 3, \ldots, n_i$.



B.   PREDICTION  INTERVALS  FOR  THE  STANDARD 
DEVIATIONS  OF  FUTURE  SAMPLES 

It is well known that $n_0 S_0^2/\sigma^2$ and $n_i S_i^2/\sigma^2$ ($i = 1, 2, 3, \ldots, k$)
have Chi-square distributions with $n_0 - 1$ and $n_i - 1$ degrees
of freedom respectively and that they are mutually independent.
Thus, $S_i^2/S_0^2$ follows an F distribution with $n_i - 1$ and
$n_0 - 1$ degrees of freedom, $i = 1, 2, 3, \ldots, k$.

Therefore, a prediction interval to contain the standard
deviation $S_i$ of a single future sample of $n_i$ observations is
given by

$$\Pr\{S_0 F(n_i - 1, n_0 - 1; (1-r)/2)^{1/2} < S_i < S_0 F(n_i - 1, n_0 - 1; (1+r)/2)^{1/2}\} = r \tag{2.1}$$

where $F(n_i - 1, n_0 - 1; (1-r)/2)$ and $F(n_i - 1, n_0 - 1; (1+r)/2)$
are the lower and upper percentage points of the F distribution with
$n_i - 1$ and $n_0 - 1$ degrees of freedom.

A two-sided 100r% prediction interval to contain the
standard deviation $S_i$ of a single future sample of size $n_i$
is

$$\left(F(n_i - 1, n_0 - 1; (1-r)/2)^{1/2} S_0,\; F(n_i - 1, n_0 - 1; (1+r)/2)^{1/2} S_0\right) \tag{2.2}$$


To obtain a simultaneous interval to contain the
standard deviations of k future samples assume that
$n_i = m$, $i = 1, 2, 3, \ldots, k$, and let

$$\max_i \frac{S_i^2}{S_0^2} = W_L(K, m-1, n_0-1)$$

and

$$\min_i \frac{S_i^2}{S_0^2} = W_S(K, m-1, n_0-1).$$

The random variables $W_L(K, m-1, n_0-1)$ and $W_S(K, m-1, n_0-1)$ are
known as the studentized largest and studentized smallest
Chi-square variates, respectively, in the statistical
literature and some tables [1] of the percentage points of
their distributions are available. Let $D_U(K, m-1, n_0-1; r)$
and $D_L(K, m-1, n_0-1; 1-r)$ denote the upper 100r% and the lower
100(1-r)% points of the distributions of $W_L(K, m-1, n_0-1)$
and $W_S(K, m-1, n_0-1)$, respectively.
Then

$$\Pr\{\max_i S_i^2 < D_U(K, m-1, n_0-1; r)\, S_0^2\} = r$$

and

$$\Pr\{\min_i S_i^2 > D_L(K, m-1, n_0-1; 1-r)\, S_0^2\} = 1 - r \tag{2.3}$$

A simultaneous prediction interval to contain all the
standard deviations $S_1, S_2, S_3, \ldots, S_k$ is given by

$$\left(D_L(K, m-1, n_0-1; 1-r)^{1/2} S_0,\; D_U(K, m-1, n_0-1; r)^{1/2} S_0\right)$$



C.   PREDICTION  INTERVALS  FOR  THE  OBSERVATIONS  IN  A  FUTURE 
SAMPLE 


Let $X_1, X_2, X_3, \ldots, X_n$ be the values of n given samples
from a normal distribution $N(\mu, \sigma^2)$ and let $X_{n+1}, X_{n+2}, X_{n+3},
\ldots, X_{n+k}$ be the values of k future independent observations
to be drawn from the same distribution. To get a prediction
interval to contain a single additional observation $X_{n+1}$,
we proceed as follows.
Let

$$Z_1 = X_{n+1} - \bar{X}_0 .$$

The expected value of $Z_1$ is zero and the standard deviation
of $Z_1$ is $\sigma (1 + 1/n)^{1/2}$.

It is easily seen that the standardized variable

$$Z_1' = \frac{Z_1 - 0}{\sigma (1 + 1/n)^{1/2}} = \frac{X_{n+1} - \bar{X}_0}{\sigma (1 + 1/n)^{1/2}}$$

and

$$S_0^2 = \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X}_0)^2$$

are independent. Therefore,

$$T = \frac{Z_1'}{\sqrt{\dfrac{(n-1) S_0^2}{\sigma^2 (n-1)}}} = \frac{X_{n+1} - \bar{X}_0}{S_0 (1 + 1/n)^{1/2}}$$

follows a t distribution with n-1 degrees of freedom.
Thus,

$$\Pr\{\bar{X}_0 + t(n-1; (1-r)/2)(1 + 1/n)^{1/2} S_0 < X_{n+1} < \bar{X}_0 + t(n-1; (1+r)/2)(1 + 1/n)^{1/2} S_0\} = r \tag{2.4}$$

where $t(n-1; (1-r)/2)$ and $t(n-1; (1+r)/2)$ are the lower and upper
percentage points of the t-distribution with n-1 degrees of freedom.

Hence, a two-sided 100r% prediction interval to contain
a single future observation $X_{n+1}$ is

$$\bar{X}_0 \pm t(n-1; (1+r)/2)(1 + 1/n)^{1/2} S_0 .$$


To  determine  a  simultaneous  prediction  interval  to 
contain  all  k  future  observations,  first,  let 


$$Z_i = X_{n+i} - \bar{X}_0, \quad i = 1, 2, 3, \ldots, k.$$

Then, the expected value of $Z_i$ is zero and the variance of
$Z_i$ is

$$\sigma^2 (1 + 1/n),$$

and it can be shown that $\mathrm{cov}(Z_i, Z_j) = \sigma^2/n$ for all i and j,
$i \neq j$. The transformed variables

$$Z_i' = \frac{Z_i}{\sigma (1 + 1/n)^{1/2}}, \quad i = 1, 2, 3, \ldots, k,$$

have standard normal distributions.

Since $(n-1) S_0^2/\sigma^2$ is independent of the $Z_i'$ and has a
Chi-square distribution with n-1 degrees of freedom, each
of the ratios

$$T_i = \frac{Z_i'}{\sqrt{\dfrac{(n-1) S_0^2}{(n-1) \sigma^2}}} = \frac{X_{n+i} - \bar{X}_0}{S_0 (1 + 1/n)^{1/2}}$$

follows a Student's t-distribution with n-1 degrees of
freedom and the $T_i$ are correlated. The random variables
$T_1, T_2, T_3, \ldots, T_k$ are jointly distributed according to the
multivariate generalization of the Student's t-distribution
with n-1 degrees of freedom. Tables of the percentage
points of this distribution are given in [4]. If u is
defined as the solution of the integral equation

$$r = \int_{-u}^{u} \int_{-u}^{u} \cdots \int_{-u}^{u} f_{T_1, T_2, T_3, \ldots, T_k}(T_1, T_2, T_3, \ldots, T_k)\, dT_1\, dT_2\, dT_3 \cdots dT_k$$

where $f_{T_1, T_2, T_3, \ldots, T_k}$ is the joint probability density
function of the multivariate t-distribution with n-1 degrees
of freedom, then

$$\Pr\{\bar{X}_0 - u(1 + 1/n)^{1/2} S_0 < X_{n+1} < \bar{X}_0 + u(1 + 1/n)^{1/2} S_0, \ldots, \bar{X}_0 - u(1 + 1/n)^{1/2} S_0 < X_{n+k} < \bar{X}_0 + u(1 + 1/n)^{1/2} S_0\} = r .$$

The resulting 100r% simultaneous prediction interval to
contain the values $X_{n+1}, X_{n+2}, X_{n+3}, \ldots, X_{n+k}$ of all k additional
observations is

$$\bar{X}_0 \pm u (1 + 1/n)^{1/2} S_0 .$$


D.   SOME  THEOREMS  USED  IN  DERIVING  THE  RESULTS  IN  THE  THESIS 

*  Theorem 1.   If X is distributed $N(\mu, \sigma^2 I)$, then $X'AX/\sigma^2$
is distributed as $\chi^2(k, \lambda)$, where $\lambda = \mu' A \mu / 2\sigma^2$ and k = rank
of A, if and only if A is idempotent.

*  Theorem 2.   If X is distributed $N(\mu, V)$, then $X'BX$ is
distributed as $\chi^2(k, \lambda)$, where $\lambda = \tfrac{1}{2} \mu' B \mu$ and k is the rank
of B, if and only if BV is idempotent.

*  Theorem 3.   If X is distributed $N(\mu, V)$, then $X'AX$ and
$X'BX$ are independent if and only if $AVB = 0$.

*  Theorem 4.   If X is distributed $N(\mu, V)$, then $Y = C'X$ and
$X'AX$ are independent if and only if $C'VA = 0$.

*  Theorem 5.   (Hogg and Craig theorem)

Let $Q = Q_1 + Q_2 + Q_3 + \ldots + Q_{k-1} + Q_k$, where $Q, Q_1, Q_2, Q_3, \ldots, Q_{k-1}$, and
$Q_k$ are k+1 random variables that are quadratic forms in the
observations of a random sample of size n from a normal
distribution $N(\mu, \sigma^2)$. Let $Q/\sigma^2$ be $\chi^2(r)$, let $Q_i/\sigma^2$ be
$\chi^2(r_i)$, $i = 1, 2, 3, \ldots, k-1$, and let $Q_k$ be non-negative. Then
the random variables $Q_1, Q_2, Q_3, \ldots, Q_k$ are mutually stochastically
independent and, hence, $Q_k/\sigma^2$ is $\chi^2\!\left(r_k = r - \sum_{j=1}^{k-1} r_j\right)$.


*  Theorem 6.   (Baldessari theorem).   Let $X_{n \times 1}$ have a
multivariate normal distribution with mean vector $\mu_{n \times 1}$ and
covariance matrix $V_{n \times n}$, i.e., $N(\mu, V)$, and let $B_0, B_1, B_2, \ldots, B_k$
be $(n \times n)$ idempotent matrices satisfying

$$\sum_{j=0}^{k} B_j = I_{n \times n} - \frac{1}{n} E_{n \times n}$$

where $I_{n \times n}$ is the $(n \times n)$ identity matrix and $E_{n \times n}$ is
an $(n \times n)$ matrix all of whose elements are unity. Let
a be a positive constant. Then, a necessary and sufficient
condition for $X'B_jX/a$, $j = 1, 2, 3, \ldots, k$, to be mutually
independent and have non-central Chi-square distributions
with $r_j$ ($r_j$ = rank of $B_j$, $j = 0, 1, 2, \ldots, k$) degrees of freedom
is that the covariance matrix V has the following structure

$$V_{n \times n} = \tfrac{1}{2}\left(H_{n \times n} + H'_{n \times n}\right) + a\left(I_{n \times n} - E_{n \times n}\right)$$

where $H_{n \times n}$, $I_{n \times n}$ and $E_{n \times n}$ are defined in (1.1).

III.   PREDICTION  INTERVALS  TO  CONTAIN  THE  STANDARD  DEVIATIONS
OF  FUTURE  SAMPLES  -  CORRELATED  CASE 

Hahn  [5]  derived  prediction  intervals  to  contain  the 
standard  deviations  of  future  samples  of  independent  and 
identically  distributed  random  variables  from  a  normal 
distribution  with  unknown  mean  and  unknown  standard
deviation.  In  this  chapter  we  extend  Hahn's  results  to  the  case
where  the  samples  are  correlated  and  have  a  special  type  of 
covariance  structure. 

Section A deals with the procedures for constructing a
prediction interval to contain the standard deviation of a
single future sample of $n_1$ observations, based on a
given sample of size $n_0$.

Section B deals with the construction of simultaneous
prediction intervals to contain the standard deviations $S_i$,
$i = 1, 2, 3, \ldots, k$, of k future samples of sizes $n_i$.

Numerical  examples  are  given  in  Section  C. 

A.   PREDICTION  INTERVAL  TO  CONTAIN  THE  STANDARD  DEVIATION 
OF  A  SINGLE  FUTURE  SAMPLE 

Let $X_{01}, X_{02}, X_{03}, \ldots, X_{0 n_0}$ be the values of a given sample
and $X_{11}, X_{12}, X_{13}, \ldots, X_{1 n_1}$ the values of a future sample. It
is required to construct a prediction interval for the
standard deviation $S_1$ of the future sample based on the
standard deviation $S_0$ of the given sample.
Let

$$\bar{X}_i = \frac{1}{n_i} \sum_{j=1}^{n_i} X_{ij}$$

and

$$S_i^2 = \frac{1}{n_i} \sum_{j=1}^{n_i} (X_{ij} - \bar{X}_i)^2, \quad \text{where } i = 0, 1.$$

Let

$$\bar{X} = \frac{1}{N} \sum_{i=0}^{1} \sum_{j=1}^{n_i} X_{ij}, \quad \text{where } N = n_0 + n_1,$$

and

$$S^2 = \frac{1}{N} \sum_{i=0}^{1} \sum_{j=1}^{n_i} (X_{ij} - \bar{X})^2$$

denote the sample mean and variance of the combined sample
of size $N = n_0 + n_1$.

Since $\bar{X}_1 = (N\bar{X} - n_0 \bar{X}_0)/n_1$, the sum of squares $NS^2$ can be
partitioned as follows:

$$\begin{aligned}
NS^2 &= \sum_{i=0}^{1} \sum_{j=1}^{n_i} (X_{ij} - \bar{X})^2 = \sum_{i=0}^{1} \sum_{j=1}^{n_i} (X_{ij} - \bar{X}_i + \bar{X}_i - \bar{X})^2 \\
&= \sum_{i=0}^{1} \sum_{j=1}^{n_i} (X_{ij} - \bar{X}_i)^2 + \sum_{i=0}^{1} n_i (\bar{X}_i - \bar{X})^2 \\
&= n_0 S_0^2 + n_1 S_1^2 + n_0 (\bar{X}_0 - \bar{X})^2 + n_1 \{(N\bar{X} - n_0 \bar{X}_0)/n_1 - \bar{X}\}^2 \\
&= n_0 S_0^2 + n_1 S_1^2 + \frac{n_0 N}{n_1} (\bar{X} - \bar{X}_0)^2
\end{aligned} \tag{3.1}$$

Expressing the sums of squares in (3.1) as quadratic forms
we can write the equation as

$$X'B_0X = X'B_1X + X'B_2X + X'B_3X \tag{3.2}$$

where $B_0$, $B_1$ and $B_2$ are idempotent matrices and

$$B_0 = I_{N \times N} - N^{-1} E_{N \times N} \quad \text{and} \quad B_{i+1} = I_{n_i \times n_i} - n_i^{-1} E_{n_i \times n_i}, \quad i = 0, 1.$$

If $X_{01}, X_{02}, X_{03}, \ldots, X_{0 n_0}, X_{11}, X_{12}, X_{13}, \ldots, X_{1 n_1}$ are
independent and have identical normal distributions with
mean $\mu$ and variance $\sigma^2$, then it is known that $NS^2/\sigma^2$, $n_0 S_0^2/\sigma^2$
and $n_1 S_1^2/\sigma^2$ have Chi-square distributions with $N-1$, $n_0-1$
and $n_1-1$ degrees of freedom respectively and $n_0 N (\bar{X} - \bar{X}_0)^2/n_1$
is non-negative. Thus, Hogg and Craig's theorem (Theorem 5)
applies to equation (3.1). Therefore, the three quadratic
forms on the right-hand side of (3.1) are mutually independent
and $n_0 N (\bar{X} - \bar{X}_0)^2/n_1 \sigma^2$ has a Chi-square distribution with 1
degree of freedom. It also follows that the matrix $B_3$ in
equation (3.2) is also idempotent.
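
The matrix identities behind (3.1) and (3.2) can be verified numerically. The sketch below assumes NumPy and illustrative sample sizes; embedding the $n_i \times n_i$ centering matrices as diagonal blocks of an $N \times N$ matrix, and writing $B_3$ as an outer product, are assumptions made here so that all four matrices act on the same stacked vector.

```python
import numpy as np

n0, n1 = 6, 10                      # illustrative sample sizes
N = n0 + n1

def centering(n):
    """Return I - E/n, the n x n centering matrix."""
    return np.eye(n) - np.ones((n, n)) / n

def embed(start, n):
    """Place the n x n centering matrix on the diagonal block starting at `start`."""
    B = np.zeros((N, N))
    B[start:start + n, start:start + n] = centering(n)
    return B

B0 = centering(N)                   # X'B0X = N S^2
B1 = embed(0, n0)                   # X'B1X = n0 S0^2
B2 = embed(n0, n1)                  # X'B2X = n1 S1^2

# Between-sample term (n0 n1 / N)(Xbar1 - Xbar0)^2 written as (c'X)^2.
c = np.sqrt(n0 * n1 / N) * np.concatenate([-np.ones(n0) / n0, np.ones(n1) / n1])
B3 = np.outer(c, c)

partition_holds = bool(np.allclose(B0, B1 + B2 + B3))
all_idempotent = all(np.allclose(B @ B, B) for B in (B0, B1, B2, B3))
ranks = [int(round(np.trace(B))) for B in (B1, B2, B3)]  # rank equals trace for idempotent matrices
print(partition_holds, all_idempotent, ranks)
```

The between-sample term uses the equivalent form $(n_0 n_1/N)(\bar{X}_1 - \bar{X}_0)^2$, which equals $n_0 N (\bar{X} - \bar{X}_0)^2/n_1$ because $\bar{X} - \bar{X}_0 = n_1(\bar{X}_1 - \bar{X}_0)/N$.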

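Now, suppose $X = (X_{01}, X_{02}, X_{03}, \ldots, X_{0 n_0}, X_{11}, X_{12}, X_{13}, \ldots, X_{1 n_1})'$
is a vector random variable having a multivariate normal
distribution with mean $\mu_{N \times 1} = (\mu, \mu, \mu, \ldots, \mu)'$ and covariance
matrix $V_{N \times N}$ which has the following structure:

$$V_{N \times N} = \tfrac{1}{2}\left(H_{N \times N} + H'_{N \times N}\right) + a\left(I_{N \times N} - E_{N \times N}\right) \tag{3.3}$$

where

$$H_{N \times N} = \begin{pmatrix} h_1 & h_1 & \cdots & h_1 \\ h_2 & h_2 & \cdots & h_2 \\ \vdots & \vdots & & \vdots \\ h_N & h_N & \cdots & h_N \end{pmatrix},$$

$I_{N \times N}$ is an $N \times N$ identity matrix, $E_{N \times N}$ is an $N \times N$ matrix whose
elements are all unity, and $a$ and $h_i$, $i = 1, 2, 3, \ldots, N$, are
positive constants.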
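
One way to see why this structure leaves sums of squares about a mean unaffected is that centering removes the $H$ and $E$ parts of $V$: with $B = I - N^{-1}E$, one has $BVB = aB$. This observation is added here for illustration and is not a step taken in the thesis; the sketch assumes NumPy and arbitrary positive values for the $h_i$.

```python
import numpy as np

rng = np.random.default_rng(0)
N, a = 8, 2.0
h = rng.uniform(3.0, 6.0, size=N)          # arbitrary positive h_i (illustrative)

H = np.tile(h[:, None], (1, N))            # row i repeats h_i
V = 0.5 * (H + H.T) + a * (np.eye(N) - np.ones((N, N)))

B = np.eye(N) - np.ones((N, N)) / N        # centering matrix I - E/N

# B annihilates the vector of ones, so B H B, B H' B and B E B all vanish.
reduces_to_aB = bool(np.allclose(B @ V @ B, a * B))
print(reduces_to_aB)
```

The check is purely algebraic, so it does not require the chosen $h_i$ and $a$ to make $V$ a valid (positive definite) covariance matrix.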


To obtain a prediction interval for $S_1$ we start with
equations (3.1) and (3.2). Since the matrices $B_0$, $B_1$, $B_2$
and $B_3$ are idempotent matrices and

$$B_0 = \sum_{j=1}^{3} B_j = I_{N \times N} - N^{-1} E_{N \times N},$$

all the conditions of the Baldessari theorem (Theorem 6)
are now satisfied. Therefore, the three quadratic forms on
the right-hand side of (3.2), each divided by $a$, have central
Chi-square distributions with $n_0-1$, $n_1-1$ and 1 degrees of
freedom respectively and are mutually independent.
Thus, the random variable

$$\frac{X'B_2X/a}{X'B_1X/a} = \frac{(n_1-1) S_1^2 / a(n_1-1)}{(n_0-1) S_0^2 / a(n_0-1)} = \frac{S_1^2}{S_0^2}$$

follows an F-distribution with $n_1-1$ and $n_0-1$ degrees of
freedom and we obtain a prediction interval for $S_1$ as
$$\Pr\left\{F(n_1-1, n_0-1; (1-r)/2) < \frac{S_1^2}{S_0^2} < F(n_1-1, n_0-1; (1+r)/2)\right\} = r$$

or equivalently,

$$\Pr\{S_0 F(n_1-1, n_0-1; (1-r)/2)^{1/2} < S_1 < S_0 F(n_1-1, n_0-1; (1+r)/2)^{1/2}\} = r \tag{3.4}$$

where r is the chosen confidence coefficient and
$F(n_1-1, n_0-1; (1-r)/2)$ and $F(n_1-1, n_0-1; (1+r)/2)$ are the
appropriate percentage points of the F distribution with
$n_1-1$ and $n_0-1$ degrees of freedom. This yields the following
two-sided 100r% prediction interval to contain the standard
deviation $S_1$ of $n_1$ future observations:

$$\left(S_0 [F(n_0-1, n_1-1; (1+r)/2)]^{-1/2},\; S_0 F(n_1-1, n_0-1; (1+r)/2)^{1/2}\right) \tag{3.5}$$

This prediction interval for $S_1$ is exactly the same as the
one obtained by Hahn [5] for the independent case.
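
A simulation can illustrate that the interval keeps its coverage when the observations are equicorrelated. The sketch below assumes NumPy, hypothetical parameter values, and sample standard deviations computed with divisor $n_i - 1$ (an assumption made here so that the ratio of variances is exactly F distributed). The F percentage points are the tabulated values quoted in the numerical example of Section C.

```python
import numpy as np

rng = np.random.default_rng(12345)
n0, n1, reps = 6, 10, 40_000
mu, sigma, sigma_mu = 50.0, 2.0, 3.0       # hypothetical parameters

# Tabulated F points for r = 0.95: F(9,5;0.975) and F(5,9;0.975).
F_9_5, F_5_9 = 6.68, 4.48

# Equicorrelated draws: one shared effect per replication plus independent noise.
shared = rng.normal(0.0, sigma_mu, size=(reps, 1))
x = mu + shared + rng.normal(0.0, sigma, size=(reps, n0 + n1))

s0 = x[:, :n0].std(axis=1, ddof=1)         # given-sample standard deviation
s1 = x[:, n0:].std(axis=1, ddof=1)         # future-sample standard deviation

lower = s0 / np.sqrt(F_5_9)                # S0 * F(n0-1, n1-1; 0.975)^(-1/2)
upper = s0 * np.sqrt(F_9_5)                # S0 * F(n1-1, n0-1; 0.975)^(1/2)
coverage = float(np.mean((lower < s1) & (s1 < upper)))
print(round(coverage, 3))
```

The estimated coverage should be close to the nominal 0.95 regardless of the value chosen for `sigma_mu`, since the shared effect cancels in deviations from each sample mean.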


B.   A  SIMULTANEOUS  PREDICTION  INTERVAL  TO  CONTAIN  THE 
STANDARD  DEVIATION  OF  EACH  OF  k  FUTURE  SAMPLES 

As in the previous section, let $X_{01}, X_{02}, X_{03}, \ldots, X_{0 n_0}$
be the values of a given random sample and let
$X_{11}, X_{12}, X_{13}, \ldots, X_{1 n_1}, X_{21}, X_{22}, X_{23}, \ldots, X_{2 n_2}, X_{31}, X_{32}, X_{33}, \ldots, X_{3 n_3},
\ldots, X_{K1}, X_{K2}, X_{K3}, \ldots, X_{K n_K}$ be the values of K sets of future
samples from a normal distribution with unknown mean $\mu$ and
unknown standard deviation $\sigma$.
Let

$$\bar{X}_i = \frac{1}{n_i} \sum_{j=1}^{n_i} X_{ij}, \qquad S_i^2 = \frac{1}{n_i} \sum_{j=1}^{n_i} (X_{ij} - \bar{X}_i)^2,$$

where $i = 0, 1, 2, 3, \ldots, K$, and let

$$N = \sum_{i=0}^{K} n_i .$$

Also let

$$\bar{X} = \frac{1}{N} \sum_{i=0}^{K} \sum_{j=1}^{n_i} X_{ij}$$

and

$$S^2 = \frac{1}{N} \sum_{i=0}^{K} \sum_{j=1}^{n_i} (X_{ij} - \bar{X})^2$$

be the mean and variance of the pooled sample of $N = \sum_{i=0}^{K} n_i$
observations.

The sum of squares $NS^2$ can be partitioned as

$$\begin{aligned}
NS^2 &= \sum_{i=0}^{K} \sum_{j=1}^{n_i} (X_{ij} - \bar{X})^2 = \sum_{i=0}^{K} \sum_{j=1}^{n_i} (X_{ij} - \bar{X}_i + \bar{X}_i - \bar{X})^2 \\
&= \sum_{i=0}^{K} \Big[ \sum_{j=1}^{n_i} (X_{ij} - \bar{X}_i)^2 + n_i (\bar{X}_i - \bar{X})^2 \Big] \\
&= n_0 S_0^2 + n_1 S_1^2 + n_2 S_2^2 + \ldots + n_K S_K^2 + \sum_{i=0}^{K} n_i (\bar{X}_i - \bar{X})^2
\end{aligned} \tag{3.6}$$

It follows that $(N-1)S^2/\sigma^2$ has a Chi-square distribution
with $N-1$ degrees of freedom and $(n_i-1)S_i^2/\sigma^2$, $i = 0, 1, 2, 3, \ldots, K$,
have Chi-square distributions with $n_i-1$ degrees of freedom
respectively, and the last term of (3.6) is non-negative.
Applying the Hogg and Craig theorem (Theorem 5) we can conclude
that the last term of (3.6) also has a Chi-square distribution
with K degrees of freedom, since

$$(N-1) - \sum_{i=0}^{K} (n_i - 1) = (N-1) - (N - (K+1)) = K,$$

and that all the sums of squares on the right-hand side of
equation (3.6) are mutually independent.

Expressing these sums of squares as quadratic forms we
can write equation (3.6) as

$$X'BX = X'B_0X + X'B_1X + X'B_2X + \ldots + X'B_KX + X'B_{K+1}X \tag{3.7}$$

where

$$B = I_{N \times N} - N^{-1} E_{N \times N} \quad \text{and} \quad B_i = I_{n_i \times n_i} - n_i^{-1} E_{n_i \times n_i}, \quad i = 0, 1, 2, \ldots, K,$$

are idempotent matrices (see Theorem 1).
Now, suppose $X_{N \times 1}$ is a random vector having a multivariate
normal distribution with mean $\mu_{N \times 1} = (\mu, \mu, \mu, \ldots, \mu)'$ and
covariance matrix $V_{N \times N}$ which has the form (3.3).

The partition of $NS^2$ given in equations (3.6) and (3.7)
is valid for this case also. Thus, we know B and $B_i$,
$i = 0, 1, 2, 3, \ldots, K$, are idempotent matrices and $B = I_{N \times N} - N^{-1} E_{N \times N}$,
$B_i = I_{n_i \times n_i} - n_i^{-1} E_{n_i \times n_i}$. So, the conditions of the Baldessari
theorem (Theorem 6) are all satisfied. Therefore, $X'BX/a$
has a Chi-square distribution with $N-1$ degrees of freedom,
$X'B_iX/a$, $i = 0, 1, 2, 3, \ldots, K$, have Chi-square distributions
with $n_i-1$ degrees of freedom and $X'B_{K+1}X/a$ has a Chi-square
distribution with K degrees of freedom. Further, the K+2
sums of squares on the right-hand side of (3.7) are mutually
independent. Thus, each of the random variables

$$\frac{X'B_iX/a}{X'B_0X/a} = \frac{(n_i-1) S_i^2 / a(n_i-1)}{(n_0-1) S_0^2 / a(n_0-1)} = \frac{S_i^2}{S_0^2}$$

where $i = 1, 2, 3, \ldots, K$, follows an F distribution with $n_i-1$
and $n_0-1$ degrees of freedom.

Now, assume $n_1 = n_2 = n_3 = \ldots = n_K = m$. Then,
$S_i^2/S_0^2$, $i = 1, 2, 3, \ldots, K$, has an F distribution with $m-1$ and
$n_0-1$ degrees of freedom.

Define the random variables

$$W_L(K, m-1, n_0-1) = \max_i \frac{S_i^2}{S_0^2}$$

and

$$W_S(K, m-1, n_0-1) = \min_i \frac{S_i^2}{S_0^2}, \quad i = 1, 2, 3, \ldots, K.$$

The distributions of $W_L(K, m-1, n_0-1)$ and $W_S(K, m-1, n_0-1)$
are known as the studentized largest and studentized
smallest Chi-square distributions, respectively. The upper
percentage point $D_U(K, m-1, n_0-1; r)$ of $W_L(K, m-1, n_0-1)$ and
the lower percentage point $D_L(K, m-1, n_0-1; 1-r)$ of $W_S(K, m-1, n_0-1)$
were tabulated by Armitage, J.V. and Krishnaiah, P.R. and
are available in [1].
Then,

$$\Pr\{W_L(K, m-1, n_0-1) < D_U(K, m-1, n_0-1; r)\} = \Pr\left\{\max_i \frac{S_i^2}{S_0^2} < D_U(K, m-1, n_0-1; r)\right\} = r$$

and

$$\Pr\{\max_i S_i^2 < D_U(K, m-1, n_0-1; r)\, S_0^2\} = r .$$

Thus, an upper 100r% simultaneous prediction limit to
exceed the standard deviations of all K future samples
each of size m is

$$S_0\, D_U(K, m-1, n_0-1; r)^{1/2} \tag{3.8}$$

Similarly, a lower 100(1-r)% simultaneous prediction limit
to be exceeded by the standard deviations of each of K future
samples of size m is

$$S_0\, D_L(K, m-1, n_0-1; 1-r)^{1/2} \tag{3.9}$$



This  result  is  also  the  same  as  the  one  Hahn  [5]  obtained 
for  independent  samples. 

C.   NUMERICAL  EXAMPLES 

Suppose a gun is selected at random and fired $n_0 = 6$
times and the resulting miss distances from a target are
measured. Let $S_0 = 1.00$ be the standard deviation of these
observations. A prediction interval for the standard deviation
$S_1$ of $n_1 = 10$ future attempts is desired.

Then, a two-sided 95% prediction interval to contain
the standard deviation $S_1$ for a single future sample of 10
observations is obtained as follows.

For $n_1 = 10$, $n_0 = 6$ and $r = 0.95$, $F(9, 5; 0.975) = 6.68$ and
$F(5, 9; 0.975) = 4.48$, so that

$$S_0 F(9, 5; 0.975)^{1/2} = (1.00)(6.68)^{1/2} = 2.584$$

and

$$S_0 F(5, 9; 0.975)^{-1/2} = (1.00)(4.48)^{-1/2} = 0.472 .$$

Substituting the above in equation (3.5), the required
prediction interval for $S_1$ is (0.472, 2.584). Next, an upper
95% simultaneous limit to exceed the standard deviation of
all 3 future samples of size 10 is obtained as follows.

For $m = 10$, $n_0 = 6$, $K = 3$ and $r = 0.95$, $D_U(3, 9, 5; 0.95) = 6.41$
and $D_U(3, 9, 5; 0.95)^{1/2} S_0 = (6.41)^{1/2}(1.00) = 2.762$ (see Table 28,
page 41 of [1]).
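
The arithmetic of the two-sided interval can be reproduced in a few lines. This is a minimal sketch using only the tabulated F values quoted in the example; the function name `sd_prediction_interval` is an illustrative choice.

```python
import math

def sd_prediction_interval(s0, f_future_given, f_given_future):
    """Return (lower, upper) for S1 from equation (3.5).

    f_future_given : F(n1-1, n0-1; (1+r)/2), scales the upper limit
    f_given_future : F(n0-1, n1-1; (1+r)/2), scales the lower limit
    """
    lower = s0 / math.sqrt(f_given_future)
    upper = s0 * math.sqrt(f_future_given)
    return lower, upper

# Values from the example: S0 = 1.00, F(9,5;0.975) = 6.68, F(5,9;0.975) = 4.48.
lower, upper = sd_prediction_interval(1.00, 6.68, 4.48)
print(f"({lower:.4f}, {upper:.4f})")
```

The computed endpoints agree with the quoted interval (0.472, 2.584) to within rounding in the third decimal place.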
IV.   PREDICTION  INTERVALS  FOR  THE  ADDITIONAL
OBSERVATIONS  IN  A  FUTURE  SAMPLE  - 
CORRELATED  CASE 

A prediction interval to contain a single future
observation and a simultaneous interval to contain each
of k additional observations of a random sample from a
normal distribution with mean $\mu$ and variance $\sigma^2$ were obtained
by Hahn [5]. In this chapter we extend these results to the
case where the samples are correlated and the covariance
matrix has the form defined in (3.3). In Section A a
prediction interval to contain a single additional observation
based on correlated observations is obtained, and
Section B deals with the construction of simultaneous
prediction intervals to contain k additional correlated
observations. Numerical examples are given in Section C.

A.   A  PREDICTION  INTERVAL  FOR  A  SINGLE  FUTURE  OBSERVATION 
Let $X_1, X_2, X_3, \ldots, X_n$ be independent and have identical
normal distributions with unknown mean $\mu$ and unknown
standard deviation $\sigma$. It is required to construct a
prediction interval for an additional observation $X_{n+1}$
based on the given n samples.

Let

$$\bar{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i, \qquad S_n^2 = \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X}_n)^2,$$

$$\bar{X}_{n+1} = \frac{1}{n+1} \sum_{i=1}^{n+1} X_i, \qquad S_{n+1}^2 = \frac{1}{n+1} \sum_{i=1}^{n+1} (X_i - \bar{X}_{n+1})^2 .$$

Since

$$X_{n+1} = (n+1)\bar{X}_{n+1} - n\bar{X}_n, \quad \text{i.e.,} \quad \bar{X}_{n+1} = \frac{1}{n+1}(X_{n+1} + n\bar{X}_n),$$

the sum of squares $(n+1)S_{n+1}^2$ can be partitioned as follows:

$$\begin{aligned}
(n+1) S_{n+1}^2 &= \sum_{i=1}^{n+1} (X_i - \bar{X}_{n+1})^2 = \sum_{i=1}^{n} (X_i - \bar{X}_{n+1})^2 + (X_{n+1} - \bar{X}_{n+1})^2 \\
&= \sum_{i=1}^{n} (X_i - \bar{X}_n + \bar{X}_n - \bar{X}_{n+1})^2 + (X_{n+1} - \bar{X}_{n+1})^2 \\
&= \sum_{i=1}^{n} (X_i - \bar{X}_n)^2 + n\Big\{\bar{X}_n - \frac{1}{n+1}(X_{n+1} + n\bar{X}_n)\Big\}^2 + \Big\{X_{n+1} - \frac{1}{n+1}(X_{n+1} + n\bar{X}_n)\Big\}^2 \\
&= n S_n^2 + n\Big(\frac{X_{n+1} - \bar{X}_n}{n+1}\Big)^2 + \Big(\frac{n(X_{n+1} - \bar{X}_n)}{n+1}\Big)^2 \\
&= n S_n^2 + \frac{n}{n+1} (X_{n+1} - \bar{X}_n)^2
\end{aligned} \tag{4.1}$$

Expressing the sums of squares in (4.1) as quadratic forms
we get

$$X'B_0X = X'B_1X + X'B_2X \tag{4.2}$$

where $B_0$ and $B_1$ are idempotent matrices and

$$B_0 = I_{(n+1) \times (n+1)} - (n+1)^{-1} E_{(n+1) \times (n+1)} \quad \text{and} \quad B_1 = I_{n \times n} - n^{-1} E_{n \times n} .$$

Since $(n+1)S_{n+1}^2/\sigma^2$ and $nS_n^2/\sigma^2$ have Chi-square distributions
with n and n-1 degrees of freedom respectively and
$n(X_{n+1} - \bar{X}_n)^2/(n+1)$ is non-negative, Hogg and Craig's theorem
(Theorem 5) applies to equation (4.1). Therefore, the two
quadratic forms on the right-hand side of (4.1) are mutually
independent and $n(X_{n+1} - \bar{X}_n)^2/(n+1)\sigma^2$ has a Chi-square
distribution with 1 degree of freedom. It also follows
that the matrix $B_2$ in equation (4.2) is also idempotent.

Now, suppose $\mathbf{X}$ is an $(n+1)\times 1$ vector random variable having a multivariate normal distribution with mean $\boldsymbol{\mu} = (\mu, \mu, \mu, \ldots, \mu)'$, of dimension $(n+1)\times 1$, and covariance matrix $V$, of dimension $(n+1)\times(n+1)$, which has the following structure:

$$V = \tfrac{1}{2}(H + H') + \alpha(I - E), \qquad (4.3)$$

where every matrix is $(n+1)\times(n+1)$,

$$H = \begin{pmatrix}
h_1 & h_1 & h_1 & \cdots & h_1 \\
h_2 & h_2 & h_2 & \cdots & h_2 \\
h_3 & h_3 & h_3 & \cdots & h_3 \\
\vdots & \vdots & \vdots & & \vdots \\
h_{n+1} & h_{n+1} & h_{n+1} & \cdots & h_{n+1}
\end{pmatrix},$$

$H'$ is the transpose of $H$, $\alpha$ and $h_i$, $i = 1, 2, 3, \ldots, n+1$, are positive constants, $I$ is an $(n+1)\times(n+1)$ identity matrix and $E$ is an $(n+1)\times(n+1)$ matrix whose elements are all unity.
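The structure (4.3) can be made concrete with a small sketch. The function name `build_v` and the numerical values of `h` and `alpha` (the constant multiplying $I - E$ in (4.3)) are illustrative assumptions, not part of the original text. Entry $(i,j)$ of $V$ is $\tfrac{1}{2}(h_i + h_j)$ plus that constant on the diagonal only, minus the constant everywhere; when all $h_i$ share a common value $h$, every variance equals $h$ and every covariance equals $h$ minus the constant, which is the equicorrelated (intraclass) case.

```python
# Sketch of the covariance structure (4.3); names and values are assumptions.

def build_v(h, alpha):
    """Return V = (1/2)(H + H') + alpha * (I - E) as a list of lists."""
    m = len(h)
    return [
        [0.5 * (h[i] + h[j]) + alpha * ((1.0 if i == j else 0.0) - 1.0)
         for j in range(m)]
        for i in range(m)
    ]

h = [3.0, 3.0, 3.0, 3.0]   # equal h_i: the equicorrelated special case
alpha = 2.0
V = build_v(h, alpha)

for row in V:
    print(row)   # diagonal entries h, off-diagonal entries h - alpha
```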
Since the matrices $B_0$, $B_1$, and $B_2$ in equation (4.2) are
idempotent and


2 

(n+l)x(n+l)     *     -j   =   (n+l)~x(n+l)  ~  (n+1)  (n+l)x(n+l) 


we may apply the Baldessari theorem (Theorem 6) to equation (4.1) to conclude that the two quadratic forms on the right-hand side of the equation have central chi-square distributions with $n-1$ and 1 degrees of freedom, respectively, and are mutually independent.

Thus, the random variable

$$\frac{\mathbf{X}'B_2\mathbf{X}/\alpha}{\mathbf{X}'B_1\mathbf{X}/\{\alpha(n-1)\}} = \frac{\dfrac{n}{n+1}(X_{n+1}-\bar{X}_n)^2/\alpha}{(n-1)S_n^2/\{\alpha(n-1)\}}$$

has an $F$ distribution with 1 and $n-1$ degrees of freedom. We obtain a prediction interval for $X_{n+1}$ as

$$\Pr\left\{F(1,n-1;(1-r)/2) < \frac{n}{n+1}\,\frac{(X_{n+1}-\bar{X}_n)^2}{S_n^2} < F(1,n-1;(1+r)/2)\right\} = r$$

or equivalently

$$\Pr\left\{\bar{X}_n + S_n\left(1+\tfrac{1}{n}\right)^{1/2}F(1,n-1;(1-r)/2)^{1/2} < X_{n+1} < \bar{X}_n + S_n\left(1+\tfrac{1}{n}\right)^{1/2}F(1,n-1;(1+r)/2)^{1/2}\right\} = r,$$

where $r$ is the chosen confidence coefficient and $F(1,n-1;(1-r)/2)$ and $F(1,n-1;(1+r)/2)$ are the appropriate percentage points of the $F$ distribution with 1 and $n-1$ degrees of freedom.

Now recall that

$$F(1,n-1;(1-r)/2)^{1/2} = t(n-1;(1-r)/2)$$

and

$$F(1,n-1;(1+r)/2)^{1/2} = t(n-1;(1+r)/2).$$

This yields the following two-sided prediction interval to contain the additional observation $X_{n+1}$:

$$\left(\bar{X}_n + S_n\left(1+\tfrac{1}{n}\right)^{1/2}t(n-1;(1-r)/2),\ \bar{X}_n + S_n\left(1+\tfrac{1}{n}\right)^{1/2}t(n-1;(1+r)/2)\right) \qquad (4.4)$$


B.   SIMULTANEOUS  PREDICTION  INTERVALS  FOR  k 
FUTURE  OBSERVATIONS 

Let $X_1, X_2, X_3, \ldots, X_n$ be the values of a given sample and $X_{n+1}, X_{n+2}, X_{n+3}, \ldots, X_{n+k}$ the values of $k$ future observations. We assume that the observations $X_1, X_2, X_3, \ldots, X_n, X_{n+1}, X_{n+2}, \ldots, X_{n+k}$ are correlated and have a multivariate normal distribution with mean $\boldsymbol{\mu} = (\mu, \mu, \mu, \ldots, \mu)'$ and covariance matrix $V$ which has the form (4.3).
In order to construct a simultaneous prediction interval for $X_{n+1}, X_{n+2}, X_{n+3}, \ldots, X_{n+k}$ we first establish that

(i) $\displaystyle\sum_{i=1}^{n}\frac{(X_i-\bar{X}_n)^2}{\alpha} = \frac{nS_n^2}{\alpha}$ has a chi-square distribution with $n-1$ degrees of freedom,

(ii) the vector variable $\mathbf{Z} = (Z_1, Z_2, Z_3, \ldots, Z_k)'$, where $Z_i = X_{n+i} - \bar{X}_n$, has a multivariate normal distribution, and

(iii) the vector variable $\mathbf{Z}$ and $nS_n^2$ are statistically independent.

If $nS_n^2/\alpha = \frac{1}{\alpha}\sum_{i=1}^{n}(X_i-\bar{X}_n)^2$ is expressed as a quadratic form $\mathbf{X}'B\mathbf{X}$, where $\mathbf{X} = (X_1, X_2, X_3, \ldots, X_n, X_{n+1}, X_{n+2}, \ldots, X_{n+k})'$, then a necessary and sufficient condition for $\mathbf{X}'B\mathbf{X}$ to have a chi-square distribution is that $BV$ is idempotent (see Theorem 2).

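To show that $BV$ is idempotent, let the matrices $H$, $H'$, and $E$ of (4.3) and the matrix $V$ be partitioned as follows:

$$H = \left(\begin{array}{cccc|cccc}
h_1 & h_1 & \cdots & h_1 & h_1 & h_1 & \cdots & h_1 \\
h_2 & h_2 & \cdots & h_2 & h_2 & h_2 & \cdots & h_2 \\
h_3 & h_3 & \cdots & h_3 & h_3 & h_3 & \cdots & h_3 \\
\vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots \\
h_n & h_n & \cdots & h_n & h_n & h_n & \cdots & h_n \\ \hline
h_{n+1} & h_{n+1} & \cdots & h_{n+1} & h_{n+1} & h_{n+1} & \cdots & h_{n+1} \\
\vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots \\
h_{n+k} & h_{n+k} & \cdots & h_{n+k} & h_{n+k} & h_{n+k} & \cdots & h_{n+k}
\end{array}\right)
= \begin{pmatrix} H_1 & H_2 \\ H_3 & H_4 \end{pmatrix},$$

$$H' = \begin{pmatrix} H_1' & H_3' \\ H_2' & H_4' \end{pmatrix},$$

and

$$E = \begin{pmatrix} E_1 & E_2 \\ E_3 & E_4 \end{pmatrix}, \qquad
V = \begin{pmatrix} V_1 & V_2 \\ V_3 & V_4 \end{pmatrix}, \qquad
I = \begin{pmatrix} I_1 & 0 \\ 0 & I_4 \end{pmatrix},$$

where in each case the blocks with subscripts 1, 2, 3 and 4 are of dimension $n\times n$, $n\times k$, $k\times n$ and $k\times k$ respectively, and where

$$V_1 = \tfrac{1}{2}(H_1 + H_1') + \alpha(I_1 - E_1)$$

and

$$V_2 = \tfrac{1}{2}(H_2 + H_3') + \alpha(0 - E_2).$$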


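We know

$$B = \begin{pmatrix} B_1 & 0 \\ 0 & 0 \end{pmatrix}, \quad \text{of dimension } (n+k)\times(n+k), \quad \text{where } B_1 = \frac{1}{\alpha}\left(I_{n\times n} - n^{-1}E_{n\times n}\right),$$

and

$$BV = \begin{pmatrix} B_1 & 0 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} V_1 & V_2 \\ V_3 & V_4 \end{pmatrix} = \begin{pmatrix} B_1V_1 & B_1V_2 \\ 0 & 0 \end{pmatrix}. \qquad (4.5)$$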
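In equation (4.5), $B_1V_1$ and $B_1V_2$ can be simplified as follows. Using $E_1H_1 = aE_1$, $E_1H_1' = nH_1'$ and $E_1E_1 = nE_1$,

$$B_1V_1 = \frac{1}{\alpha}(I_1 - n^{-1}E_1)\left\{\tfrac{1}{2}(H_1+H_1') + \alpha(I_1-E_1)\right\}$$

$$= \frac{1}{\alpha}\left(\tfrac{1}{2}H_1 + \tfrac{1}{2}H_1' + \alpha I_1 - \alpha E_1 - \tfrac{1}{2n}E_1H_1 - \tfrac{1}{2n}E_1H_1' - \tfrac{\alpha}{n}E_1 + \tfrac{\alpha}{n}E_1E_1\right)$$

$$= \frac{1}{\alpha}\left(\tfrac{1}{2}H_1 + \tfrac{1}{2}H_1' + \alpha I_1 - \alpha E_1 - \tfrac{a}{2n}E_1 - \tfrac{1}{2}H_1' - \tfrac{\alpha}{n}E_1 + \alpha E_1\right)$$

$$= \left(I_1 - \tfrac{1}{n}E_1\right) + \frac{1}{2\alpha}\left(H_1 - \tfrac{a}{n}E_1\right),$$

where $a = \sum_{i=1}^{n} h_i$. Similarly, using $E_1H_2 = aE_2$, $E_1H_3' = nH_3'$ and $E_1E_2 = nE_2$,

$$B_1V_2 = \frac{1}{\alpha}(I_1 - n^{-1}E_1)\left\{\tfrac{1}{2}(H_2+H_3') + \alpha(0-E_2)\right\}$$

$$= \frac{1}{\alpha}\left(\tfrac{1}{2}H_2 + \tfrac{1}{2}H_3' - \alpha E_2 - \tfrac{1}{2n}E_1H_2 - \tfrac{1}{2n}E_1H_3' + \tfrac{\alpha}{n}E_1E_2\right)$$

$$= \frac{1}{\alpha}\left(\tfrac{1}{2}H_2 + \tfrac{1}{2}H_3' - \alpha E_2 - \tfrac{a}{2n}E_2 - \tfrac{1}{2}H_3' + \alpha E_2\right)$$

$$= \frac{1}{2\alpha}\left(H_2 - \tfrac{a}{n}E_2\right).$$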
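Thus

$$BV = \begin{pmatrix} B_1V_1 & B_1V_2 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} \left(I_1 - n^{-1}E_1\right) + \dfrac{1}{2\alpha}\left(H_1 - \dfrac{a}{n}E_1\right) & \dfrac{1}{2\alpha}\left(H_2 - \dfrac{a}{n}E_2\right) \\ 0 & 0 \end{pmatrix}$$

and

$$(BV)(BV) = \begin{pmatrix} B_1V_1 & B_1V_2 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} B_1V_1 & B_1V_2 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} (B_1V_1)^2 & (B_1V_1)(B_1V_2) \\ 0 & 0 \end{pmatrix}. \qquad (4.6)$$

But, since $E_1E_1 = nE_1$, $E_1H_1 = aE_1$, $H_1E_1 = nH_1$ and $H_1H_1 = aH_1$,

$$(B_1V_1)^2 = \left\{\left(I_1 - n^{-1}E_1\right) + \frac{1}{2\alpha}\left(H_1 - \tfrac{a}{n}E_1\right)\right\}^2$$

$$= \left(I_1 - \tfrac{1}{n}E_1\right)^2 + \frac{1}{2\alpha}\left\{\left(I_1 - \tfrac{1}{n}E_1\right)\left(H_1 - \tfrac{a}{n}E_1\right) + \left(H_1 - \tfrac{a}{n}E_1\right)\left(I_1 - \tfrac{1}{n}E_1\right)\right\} + \frac{1}{4\alpha^2}\left(H_1 - \tfrac{a}{n}E_1\right)^2$$

$$= \left(I_1 - \tfrac{1}{n}E_1\right) + \frac{1}{2\alpha}\left(H_1 - \tfrac{a}{n}E_1\right),$$

because $\left(I_1 - \tfrac{1}{n}E_1\right)\left(H_1 - \tfrac{a}{n}E_1\right) = H_1 - \tfrac{a}{n}E_1$, while $\left(H_1 - \tfrac{a}{n}E_1\right)\left(I_1 - \tfrac{1}{n}E_1\right) = 0$ and $\left(H_1 - \tfrac{a}{n}E_1\right)^2 = 0$. Likewise, since $E_1H_2 = aE_2$, $E_1E_2 = nE_2$, $H_1H_2 = aH_2$ and $H_1E_2 = nH_2$,

$$(B_1V_1)(B_1V_2) = \left\{\left(I_1 - \tfrac{1}{n}E_1\right) + \frac{1}{2\alpha}\left(H_1 - \tfrac{a}{n}E_1\right)\right\}\frac{1}{2\alpha}\left(H_2 - \tfrac{a}{n}E_2\right)$$

$$= \frac{1}{2\alpha}\left(I_1 - \tfrac{1}{n}E_1\right)\left(H_2 - \tfrac{a}{n}E_2\right) + \frac{1}{4\alpha^2}\left(H_1 - \tfrac{a}{n}E_1\right)\left(H_2 - \tfrac{a}{n}E_2\right)$$

$$= \frac{1}{2\alpha}\left(H_2 - \tfrac{a}{n}E_2\right).$$

Therefore

$$(BV)(BV) = \begin{pmatrix} \left(I_1 - \dfrac{1}{n}E_1\right) + \dfrac{1}{2\alpha}\left(H_1 - \dfrac{a}{n}E_1\right) & \dfrac{1}{2\alpha}\left(H_2 - \dfrac{a}{n}E_2\right) \\ 0 & 0 \end{pmatrix} = BV.$$

Since $BV$ is idempotent, $\mathbf{X}'B\mathbf{X} = nS_n^2/\alpha$ has a chi-square distribution with $n-1$ degrees of freedom.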
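The idempotency argument can be checked numerically for a particular choice of constants. The sketch below is only an illustration under stated assumptions: the helper names, the values of `h`, and `alpha` (the constant multiplying $I - E$ in (4.3)) are hypothetical, and $B$ is taken to carry the factor $1/\alpha$ in its upper-left $n \times n$ block. It forms $BV$, compares $(BV)(BV)$ with $BV$, and reports the trace of $BV$, which for an idempotent matrix equals its rank.

```python
# Numerical check that BV is idempotent with trace n - 1 (illustrative values).

def matmul(A, B):
    return [[sum(A[i][t] * B[t][j] for t in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def build_v(h, alpha):
    """V = (1/2)(H + H') + alpha * (I - E), with H having constant rows h_i."""
    m = len(h)
    return [[0.5 * (h[i] + h[j]) + alpha * ((1.0 if i == j else 0.0) - 1.0)
             for j in range(m)] for i in range(m)]

def build_b(n, k, alpha):
    """B = diag(B_1, 0) with B_1 = (1/alpha) * (I - E/n) in the upper-left block."""
    m = n + k
    B = [[0.0] * m for _ in range(m)]
    for i in range(n):
        for j in range(n):
            B[i][j] = ((1.0 if i == j else 0.0) - 1.0 / n) / alpha
    return B

n, k = 4, 2
h = [1.5, 2.0, 2.5, 3.0, 2.2, 2.8]   # hypothetical h_1, ..., h_{n+k}
alpha = 1.0

BV = matmul(build_b(n, k, alpha), build_v(h, alpha))
BV2 = matmul(BV, BV)

max_diff = max(abs(BV2[i][j] - BV[i][j])
               for i in range(n + k) for j in range(n + k))
trace = sum(BV[i][i] for i in range(n + k))
print(max_diff, trace)   # max_diff should be ~0 and trace should be n - 1
```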

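Next, the $k\times 1$ vector $\mathbf{Z} = (Z_1, Z_2, Z_3, \ldots, Z_k)'$ can be expressed as

$$\mathbf{Z} = C'\mathbf{X},$$

with $C'$ of dimension $k\times(n+k)$ and $\mathbf{X}$ of dimension $(n+k)\times 1$, where

$$C' = \left(-\tfrac{1}{n}E_{k\times n},\ I_{k\times k}\right).$$

$\mathbf{Z}$ has a multivariate normal distribution with mean $C'\boldsymbol{\mu} = \mathbf{0}$ and covariance matrix $C'VC$ as shown below: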


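$$C'\boldsymbol{\mu} = \left(-\tfrac{1}{n}E_{k\times n},\ I_{k\times k}\right)\boldsymbol{\mu} = \mu\left(-\tfrac{1}{n}\,n + 1\right)E_{k\times 1} = \mathbf{0},$$

$$C'V = \left(-\tfrac{1}{n}E_{k\times n},\ I_{k\times k}\right)\left\{\tfrac{1}{2}\begin{pmatrix} H_1 & H_2 \\ H_3 & H_4 \end{pmatrix} + \tfrac{1}{2}\begin{pmatrix} H_1' & H_3' \\ H_2' & H_4' \end{pmatrix} + \alpha\begin{pmatrix} I_1 & 0 \\ 0 & I_4 \end{pmatrix} - \alpha\begin{pmatrix} E_1 & E_2 \\ E_3 & E_4 \end{pmatrix}\right\}$$

$$= \tfrac{1}{2}\begin{pmatrix} \left(-\tfrac{a}{n} + h_{n+1}\right)E_{1\times(n+k)} \\ \left(-\tfrac{a}{n} + h_{n+2}\right)E_{1\times(n+k)} \\ \vdots \\ \left(-\tfrac{a}{n} + h_{n+k}\right)E_{1\times(n+k)} \end{pmatrix} + \tfrac{1}{2}\left(\left(-\tfrac{n}{n}h_1 + h_1\right)E_{k\times 1},\ \left(-\tfrac{n}{n}h_2 + h_2\right)E_{k\times 1},\ \ldots,\ \left(-\tfrac{n}{n}h_{n+k} + h_{n+k}\right)E_{k\times 1}\right)$$

$$\quad + \alpha\left(-\tfrac{1}{n}E_{k\times n},\ I_{k\times k}\right) - \alpha\left(-\tfrac{1}{n}\,n + 1\right)E_{k\times(n+k)}$$

$$= \tfrac{1}{2}\begin{pmatrix} \left(-\tfrac{a}{n} + h_{n+1}\right)E_{1\times(n+k)} \\ \vdots \\ \left(-\tfrac{a}{n} + h_{n+k}\right)E_{1\times(n+k)} \end{pmatrix} + \alpha\left(-\tfrac{1}{n}E_{k\times n},\ I_{k\times k}\right), \qquad (4.7)$$

$$C'VC = \left\{\tfrac{1}{2}\begin{pmatrix} \left(-\tfrac{a}{n} + h_{n+1}\right)E_{1\times(n+k)} \\ \left(-\tfrac{a}{n} + h_{n+2}\right)E_{1\times(n+k)} \\ \vdots \\ \left(-\tfrac{a}{n} + h_{n+k}\right)E_{1\times(n+k)} \end{pmatrix} + \alpha\left(-\tfrac{1}{n}E_{k\times n},\ I_{k\times k}\right)\right\}\begin{pmatrix} -\tfrac{1}{n}E_{n\times k} \\ I_{k\times k} \end{pmatrix}$$

$$= \tfrac{1}{2}\begin{pmatrix} \left[\left(-\tfrac{a}{n} + h_{n+1}\right)\left(-\tfrac{n}{n}\right) + \left(-\tfrac{a}{n} + h_{n+1}\right)\right]E_{1\times k} \\ \left[\left(-\tfrac{a}{n} + h_{n+2}\right)\left(-\tfrac{n}{n}\right) + \left(-\tfrac{a}{n} + h_{n+2}\right)\right]E_{1\times k} \\ \vdots \\ \left[\left(-\tfrac{a}{n} + h_{n+k}\right)\left(-\tfrac{n}{n}\right) + \left(-\tfrac{a}{n} + h_{n+k}\right)\right]E_{1\times k} \end{pmatrix} + \alpha\left(\tfrac{1}{n^2}E_{k\times n}E_{n\times k} + I_{k\times k}\right)$$

$$= \alpha\left(\tfrac{1}{n}E_{k\times k} + I_{k\times k}\right) = \alpha\begin{pmatrix} 1+\tfrac{1}{n} & \tfrac{1}{n} & \cdots & \tfrac{1}{n} \\ \tfrac{1}{n} & 1+\tfrac{1}{n} & \cdots & \tfrac{1}{n} \\ \vdots & \vdots & \ddots & \vdots \\ \tfrac{1}{n} & \tfrac{1}{n} & \cdots & 1+\tfrac{1}{n} \end{pmatrix}.$$

Therefore, $\mathbf{Z} \sim N\left(\mathbf{0},\ \alpha\left(\tfrac{1}{n}E + I\right)\right)$.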

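Also, $\mathbf{Z} = C'\mathbf{X}$ and $\mathbf{X}'B\mathbf{X}$ are statistically independent (see Theorem 4), since

$$C'VB = \left\{\tfrac{1}{2}\begin{pmatrix} \left(-\tfrac{a}{n} + h_{n+1}\right)E_{1\times(n+k)} \\ \left(-\tfrac{a}{n} + h_{n+2}\right)E_{1\times(n+k)} \\ \vdots \\ \left(-\tfrac{a}{n} + h_{n+k}\right)E_{1\times(n+k)} \end{pmatrix} + \alpha\left(-\tfrac{1}{n}E_{k\times n},\ I_{k\times k}\right)\right\}\begin{pmatrix} \tfrac{1}{\alpha}\left(I_{n\times n} - n^{-1}E_{n\times n}\right) & 0 \\ 0 & 0 \end{pmatrix}$$

$$= \left(\frac{1}{2\alpha}\begin{pmatrix} \left(-\tfrac{a}{n} + h_{n+1}\right)E_{1\times n} \\ \vdots \\ \left(-\tfrac{a}{n} + h_{n+k}\right)E_{1\times n} \end{pmatrix} - \frac{1}{2\alpha n}\begin{pmatrix} \left(-\tfrac{a}{n} + h_{n+1}\right)nE_{1\times n} \\ \vdots \\ \left(-\tfrac{a}{n} + h_{n+k}\right)nE_{1\times n} \end{pmatrix} - \tfrac{1}{n}E_{k\times n} + \tfrac{1}{n^2}\,nE_{k\times n},\ \ 0_{k\times k}\right)$$

$$= 0.$$

Thus, each $Z_i$, $i = 1, 2, 3, \ldots, k$, is normally distributed with mean 0 and variance $\alpha(1 + 1/n)$ and is independent of $S^2$.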
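Both matrix results of this step, the covariance of the vector of differences between each future value and the sample mean, and its independence from the quadratic form, can be checked numerically. The following sketch is an illustration under stated assumptions: the helper names and the values of `h` and `alpha` (the constant multiplying $I - E$ in (4.3)) are hypothetical. It verifies that $C'VC$ equals that constant times $(E/n + I)$ and that $C'VB$ vanishes.

```python
# Check of C'VC = alpha * (E/n + I) and C'VB = 0 for illustrative constants.

def matmul(A, B):
    return [[sum(A[i][t] * B[t][j] for t in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

n, k = 4, 3
m = n + k
h = [1.5, 2.0, 2.5, 3.0, 2.2, 2.8, 2.4]   # hypothetical h_1, ..., h_{n+k}
alpha = 1.0

# V = (1/2)(H + H') + alpha * (I - E)
V = [[0.5 * (h[i] + h[j]) + alpha * ((1.0 if i == j else 0.0) - 1.0)
      for j in range(m)] for i in range(m)]

# B = diag((1/alpha)(I - E/n), 0)
B = [[((1.0 if i == j else 0.0) - 1.0 / n) / alpha if (i < n and j < n) else 0.0
      for j in range(m)] for i in range(m)]

# C' = (-(1/n) E_{k x n}, I_k): row i maps X to X_{n+i} minus the sample mean
Ct = [[-1.0 / n] * n + [1.0 if j == i else 0.0 for j in range(k)]
      for i in range(k)]

CtV = matmul(Ct, V)
CtVC = matmul(CtV, transpose(Ct))
CtVB = matmul(CtV, B)

expected = [[alpha * ((1.0 if i == j else 0.0) + 1.0 / n) for j in range(k)]
            for i in range(k)]
err_cov = max(abs(CtVC[i][j] - expected[i][j]) for i in range(k) for j in range(k))
err_indep = max(abs(x) for row in CtVB for x in row)
print(err_cov, err_indep)   # both should be ~0
```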
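Let $Z_i'$, $i = 1, 2, 3, \ldots, k$, be the standardized variables defined by

$$Z_i' = \frac{Z_i - 0}{\{\alpha(1+\tfrac{1}{n})\}^{1/2}} = \frac{X_{n+i} - \bar{X}_n}{\{\alpha(1+\tfrac{1}{n})\}^{1/2}}.$$

Then the variables

$$T_i = \frac{Z_i'}{\sqrt{\dfrac{(n-1)S^2}{\alpha(n-1)}}} = \frac{X_{n+i} - \bar{X}_n}{S\left(1+\tfrac{1}{n}\right)^{1/2}}, \qquad i = 1, 2, 3, \ldots, k,$$

are jointly distributed according to the multivariate generalization of the Student t-distribution with $n-1$ degrees of freedom and correlation matrix $\Sigma$ defined by

$$\Sigma = \begin{pmatrix} 1 & \tfrac{1}{n+1} & \cdots & \tfrac{1}{n+1} \\ \tfrac{1}{n+1} & 1 & \cdots & \tfrac{1}{n+1} \\ \vdots & \vdots & \ddots & \vdots \\ \tfrac{1}{n+1} & \tfrac{1}{n+1} & \cdots & 1 \end{pmatrix}.$$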
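To find a two-sided 100r% simultaneous prediction interval to contain each of $k$ additional observations, let $U$ be such that

$$\int_{-U}^{U}\int_{-U}^{U}\cdots\int_{-U}^{U} f(t_1, t_2, t_3, \ldots, t_k)\, dt_1\, dt_2 \cdots dt_k = r, \qquad (4.8)$$

where $f$ denotes the joint density of $T_1, T_2, \ldots, T_k$. Then

$$\Pr\left\{-U < \frac{X_{n+1} - \bar{X}_n}{S\left(1+\tfrac{1}{n}\right)^{1/2}} < U \ \text{and} \ \ldots \ \text{and} \ -U < \frac{X_{n+k} - \bar{X}_n}{S\left(1+\tfrac{1}{n}\right)^{1/2}} < U\right\} = r.$$

The resulting 100r% simultaneous prediction interval to contain the values $X_{n+1}, X_{n+2}, X_{n+3}, \ldots, X_{n+k}$ of all $k$ future observations is

$$\bar{X}_n \pm U\left(1+\tfrac{1}{n}\right)^{1/2} S. \qquad (4.9)$$

For selected values of $r$, the values of $U$ satisfying equation (4.8) were tabulated by Hahn and are available in [4].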
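When tables are not at hand, the multiplier of $S$ in (4.9) can be approximated by simulation. The sketch below is only an illustration and not the method used to produce the tables in [4]: it assumes independent standard normal observations (a special case of the model), uses the sample standard deviation with divisor $n-1$ so that $(n-1)S^2$ has $n-1$ degrees of freedom as in the definition of $T_i$, and estimates the $r$-quantile of $\max_i |X_{n+i} - \bar{X}_n| / S$, which plays the role of $U(1+1/n)^{1/2}$. The function name, replication count and seed are assumptions.

```python
# Monte Carlo approximation of the multiplier U * (1 + 1/n)^(1/2) in (4.9).
import math
import random

def simulate_factor(n, k, r, reps=20000, seed=1):
    """Estimate the r-quantile of max_i |X_{n+i} - Xbar| / S by simulation."""
    rng = random.Random(seed)
    stats = []
    for _ in range(reps):
        past = [rng.gauss(0.0, 1.0) for _ in range(n)]
        future = [rng.gauss(0.0, 1.0) for _ in range(k)]
        xbar = sum(past) / n
        # divisor n - 1 (assumption), matching n - 1 degrees of freedom
        s = math.sqrt(sum((x - xbar) ** 2 for x in past) / (n - 1))
        stats.append(max(abs(y - xbar) for y in future) / s)
    stats.sort()
    index = min(reps - 1, int(math.ceil(r * reps)) - 1)
    return stats[index]

factor = simulate_factor(n=5, k=10, r=0.95)
print(round(factor, 2))
```

The estimate carries Monte Carlo error, so it should be compared with a tabulated value only to roughly one decimal place; increasing `reps` tightens it.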

C.   NUMERICAL  EXAMPLES 

Based  upon  a  random  sample  of  observations  from  a  normal 
distribution  whose  mean  and  standard  deviation  are  unknown, 
the  following  data  is  obtained. 

51.4,   49.5,   48.7,   49.3  and  51.6 
From the data, the sample mean $\bar{X}$ and sample standard deviation $S$ are calculated as

$$\bar{X} = (51.4 + 49.5 + 48.7 + 49.3 + 51.6)/5 = 50.10$$

and

$$S^2 = \{(51.4-50.1)^2 + (49.5-50.1)^2 + (48.7-50.1)^2 + (49.3-50.1)^2 + (51.6-50.1)^2\}/5 = 6.9/5 = 1.38,$$

$$S = 1.175.$$


Then, a two-sided prediction interval to contain a single future observation $X_{n+1}$ with 95% probability is obtained from equation (4.4). For $n=5$ and $r=0.95$, the Student's t-tables give $t(4; 0.975) = 2.776$. Substituting the observed values in (4.4), a 95% prediction interval for a future observation $X_{n+1}$ is given by
(46.527, 53.673)

Next, a two-sided 95% simultaneous prediction interval to contain each of 10 future observations is obtained using equation (4.9). For $k=10$, $n=5$ and $r=0.95$, the tables in [4] give $U(1+\tfrac{1}{n})^{1/2} = 5.23$. Thus $\bar{X} \pm U(1+\tfrac{1}{n})^{1/2}S = 50.1 \pm 5.23(1.175) = 50.1 \pm 6.145$ and the required prediction interval is given by
(43.955, 56.245)
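The arithmetic of both numerical examples can be reproduced with a short script. This is a sketch under stated assumptions: the variable names are hypothetical, the two multipliers (2.776 and 5.23) are simply the tabulated values quoted in the text rather than computed, and $S$ is calculated with divisor $n$ exactly as in the example above. With the divisor $n-1$ instead, $S$ would be about 1.313 and both intervals would be wider, so the divisor assumed by the tabulated factors should be checked against [4] before the numbers are relied on.

```python
# Recomputing the two prediction intervals of the numerical example.
import math

data = [51.4, 49.5, 48.7, 49.3, 51.6]
n = len(data)

xbar = sum(data) / n
s = math.sqrt(sum((x - xbar) ** 2 for x in data) / n)   # divisor n, as in the text

t_975 = 2.776        # t(4; 0.975), quoted from the text
hahn_factor = 5.23   # U(1 + 1/n)^(1/2) for k=10, n=5, r=0.95, quoted from the text

# Equation (4.4): single future observation
half_single = t_975 * s * math.sqrt(1 + 1 / n)
single = (xbar - half_single, xbar + half_single)

# Equation (4.9): all k = 10 future observations simultaneously
half_simul = hahn_factor * s
simultaneous = (xbar - half_simul, xbar + half_simul)

print([round(v, 3) for v in single])
print([round(v, 3) for v in simultaneous])
```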